Dynamic variance adaptation using differenced maximum mutual information

Authors

Marc Delcroix

Atsunori Ogawa

Tomohiro Nakatani

Atsushi Nakamura

Abstract

A conventional approach to noise-robust automatic speech recognition applies speech enhancement before recognition. However, speech enhancement cannot completely remove noise, so a mismatch between the enhanced speech and the acoustic model inevitably remains. Uncertainty decoding approaches mitigate this mismatch by accounting for the feature uncertainty during decoding. We have previously proposed dynamic variance adaptation, which estimates the feature uncertainty from adaptation data by maximizing either the likelihood or a discriminative criterion such as maximum mutual information (MMI). In unsupervised adaptation, the transcriptions are obtained from a first recognition pass and therefore contain errors. Such errors are fatal when a discriminative criterion is used. In this paper, we investigate the recently proposed differenced MMI discriminative criterion for unsupervised dynamic variance adaptation, because it inherently includes a mechanism to mitigate the influence of errors in the transcriptions.
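
As a rough illustration of the uncertainty decoding idea mentioned in the abstract, the sketch below scores an enhanced feature vector against a diagonal Gaussian whose variance is inflated by a per-dimension feature variance. The function names, the scaling factor `alpha`, and the choice to make the feature variance proportional to the squared difference between observed and enhanced features are assumptions made for illustration; the abstract does not specify the exact uncertainty model, and no parameter estimation (ML, MMI, or differenced MMI) is shown.

```python
import math


def log_gaussian(y, mean, var):
    """Log density of a diagonal Gaussian, summed over feature dimensions."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
        for yi, m, v in zip(y, mean, var)
    )


def dynamic_feature_variance(observed, enhanced, alpha):
    """Hypothetical uncertainty model (an assumption, not taken from the
    abstract): the feature variance grows with the squared amount of
    change that enhancement applied to each dimension."""
    return [alpha * (o - e) ** 2 for o, e in zip(observed, enhanced)]


def uncertainty_log_likelihood(enhanced, mean, model_var, feature_var):
    """Score the enhanced feature with the acoustic model variance
    increased by the feature variance (variance compensation)."""
    compensated = [mv + fv for mv, fv in zip(model_var, feature_var)]
    return log_gaussian(enhanced, mean, compensated)


if __name__ == "__main__":
    observed = [5.0]   # noisy feature (illustrative values)
    enhanced = [3.0]   # feature after speech enhancement
    mean, model_var = [0.0], [1.0]

    feature_var = dynamic_feature_variance(observed, enhanced, alpha=0.75)
    plain = log_gaussian(enhanced, mean, model_var)
    compensated = uncertainty_log_likelihood(enhanced, mean, model_var, feature_var)
    print(plain, compensated)
```

In this toy case the compensated score is higher than the plain one, because the inflated variance penalizes the residual mismatch less. In an adaptation setting, a factor such as `alpha` would be the quantity estimated from adaptation data.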

Similar resources

Discriminative Linear Transforms for Speaker Adaptation

Linear transform adaptation techniques such as Maximum Likelihood Linear Regression (MLLR) are a popular and effective family of methods for speaker adaptation. MLLR estimates transform parameters for Gaussian means and variances using a maximum likelihood (ML) objective function. This paper discusses the use of an alternative discriminative objective function for linear transform estimation, w...

Improvements in linear transform based speaker adaptation

This paper presents three forms of linear transform based speaker adaptation that can give better performance than standard maximum likelihood linear regression (MLLR) adaptation. For unsupervised adaptation, a lattice-based technique is introduced which is compared to MLLR using confidence scores. For supervised adaptation, estimation of the adaptation matrices using the maximum mutual informa...

Semantic Text Clusters and Word Classes – the Dualism of Mutual Information and Maximum Likelihood

Dynamically modeling the word distribution in a variety of texts is a goal with various applications. For speech recognition a dynamic unigram may efficiently be used for the adaptation of longer ranging language models. For information retrieval it may be a good starting point to predict the most characteristic words in document dependent queries. This short paper presents two approaches for a...

Quasi Maximum-Likelihood Estimation of Dynamic Panel Data Models

This paper establishes the almost sure convergence and asymptotic normality of levels and differenced quasi maximum-likelihood (QML) estimators of dynamic panel data models. The QML estimators are robust with respect to initial conditions, conditional and time-series heteroskedasticity, and misspecification of the log-likelihood. The paper also provides an ECME algorithm for calculating levels ...

The CU-HTK March 2000 Hub5E Transcription System

This paper describes the Cambridge University HTK (CU-HTK) system developed for the NIST March 2000 evaluation of English conversational telephone speech transcription (Hub5E). A range of new features have been added to the HTK system used in the 1998 Hub5 evaluation, and the changes taken together have resulted in an 11% relative decrease in word error rate on the 1998 evaluation test set. Maj...

Publication date: 2012
